A novel term weighting scheme based on discrimination power obtained from past retrieval results
Authors
Abstract
Term weighting for document ranking and retrieval has been an important research topic in information retrieval for decades. We propose a novel term weighting method based on the hypothesis that a term's role in accumulated past retrieval sessions affects its general importance regardless. The method relies on the availability of past retrieval results: the queries that contain a particular term, the documents retrieved for them, and their relevance judgments. A term's evidential weight, as we propose in this paper, depends on the degree to which the mean frequency values of the term in the past relevant and non-relevant document distributions differ. More precisely, it takes into account the rankings and similarity values of the relevant and non-relevant documents. Our experimental results on standard test collections show that the proposed term weighting scheme improves on conventional TF-IDF and language-model-based schemes. This indicates that evidential term weights bring in a new aspect of term importance and complement the collection statistics on which TF-IDF is based. We also show how the proposed term weighting scheme, based on the notion of evidential weights, is related to well-known weighting schemes based on language modeling and probabilistic models. © 2012 Elsevier Ltd. All rights reserved.
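The abstract does not give the actual formula, so the following is only a minimal sketch of the general idea: score a term by how far apart its mean frequencies are in past relevant and past non-relevant documents, then use that score to adjust a TF-IDF weight. The function names, the normalization to the range [-1, 1], and the mixing parameter `alpha` are all assumptions made for illustration. The sketch also ignores the rankings and similarity values that the paper says it takes into account.

```python
from statistics import mean


def evidential_weight(rel_tfs, nonrel_tfs):
    """Hypothetical discrimination score for one term.

    rel_tfs / nonrel_tfs: the term's frequencies in documents judged
    relevant / non-relevant in past retrieval sessions.
    Returns a value in [-1, 1]; positive means the term occurs more
    often, on average, in relevant documents.
    """
    if not rel_tfs or not nonrel_tfs:
        return 0.0  # no past evidence for one side: stay neutral
    mu_rel = mean(rel_tfs)
    mu_non = mean(nonrel_tfs)
    total = mu_rel + mu_non
    if total == 0:
        return 0.0  # term never occurred in the judged documents
    # Assumed normalization: gap between the means over their sum.
    return (mu_rel - mu_non) / total


def combined_weight(tf_idf, ev, alpha=0.5):
    """Assumed way to let the evidential score complement TF-IDF."""
    return tf_idf * (1.0 + alpha * ev)


if __name__ == "__main__":
    ev = evidential_weight([3, 5, 4], [1, 0, 2])
    print(ev, combined_weight(2.0, ev))
```

A term with no past evidence keeps its plain TF-IDF weight, which matches the abstract's framing of evidential weights as a complement to collection statistics rather than a replacement.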
Similar resources
Image Retrieval Using Dynamic Weighting of Compressed High Level Features Framework with LER Matrix
In this article, a method for database retrieval is proposed. The multi-resolution modified wavelet transform of each image is computed, and the standard deviation and average are used as the textural features. Then, the proposed modified bit-based color histogram and edge detectors are used to define the high-level features. A feedback-based dynamic weighting of shap...
Discrimination of Power Quality Distorted Signals Based on Time-frequency Analysis and Probabilistic Neural Network
Recognition and classification of Power Quality Distorted Signals (PQDSs) in power systems is an essential task. One of the noteworthy issues in Power Quality Analysis (PQA) is the identification of distorted signals using an efficient scheme. This paper proposes a Time–Frequency Analysis (TFA) for feature extraction, a so-called "hybrid approach" that incorporates Multi Resolution Analysis...
A Novel Scheme for Improving Accuracy of KNN Classification Algorithm Based on the New Weighting Technique and Stepwise Feature Selection
The K nearest neighbor (KNN) algorithm is one of the most frequently used techniques in data mining for its integrity and performance. Though the KNN algorithm is highly effective in many cases, it has some essential deficiencies, which affect the classification accuracy of the algorithm. First, the effectiveness of the algorithm is affected by redundant and irrelevant features. Furthermore, this algori...
A Learning-Based Term-Weighting Approach for Information Retrieval
One of the core components in information retrieval (IR) is the document term-weighting scheme. In this paper, we propose a novel learning-based term-weighting approach to improve the retrieval performance of the vector space model in homogeneous collections. We first introduce a simple learning system to weight the index terms of documents. Then, we deduce a formal computational approach acc...
Novel Hierarchical Control of VSI-based Microgrids Against Large-Signal Disturbances
This paper presents a novel hierarchical control for VSI-based microgrids. The advantage of the proposed control scheme is that it maintains frequency and voltage stability and load sharing against large-signal disturbances. A hierarchical control consisting of three levels is described. A new control loop based on a PI controller is presented. The new control loop has a great impact on increasi...
Journal title:
- Inf. Process. Manage.
 
Volume 48, Issue
Pages -
Publication date 2012